Overview

Dataset statistics

Number of variables: 18
Number of observations: 10353
Missing cells: 16072
Missing cells (%): 8.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 MiB
Average record size in memory: 542.2 B
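
The missing-cell percentage follows directly from the counts in this table. A minimal sketch of the arithmetic, using only the reported figures (the variable names are illustrative):

```python
# Figures copied from the dataset statistics table.
n_variables = 18
n_observations = 10353
missing_cells = 16072

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

The percentage is taken over all cells, so a column that is mostly empty (such as Months since last delinquent) contributes heavily to it.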

Variable types

Categorical: 7
Numeric: 11

Alerts

Loan ID has a high cardinality: 10000 distinct values High cardinality
Customer ID has a high cardinality: 10000 distinct values High cardinality
Annual Income is highly correlated with Monthly Debt High correlation
Monthly Debt is highly correlated with Annual Income and 1 other field High correlation
Number of Open Accounts is highly correlated with Maximum Open Credit High correlation
Number of Credit Problems is highly correlated with Bankruptcies High correlation
Current Credit Balance is highly correlated with Monthly Debt and 1 other field High correlation
Maximum Open Credit is highly correlated with Number of Open Accounts and 1 other field High correlation
Bankruptcies is highly correlated with Number of Credit Problems High correlation
Monthly Debt is highly correlated with Annual Income High correlation
Number of Credit Problems is highly correlated with Bankruptcies and 1 other field High correlation
Tax Liens is highly correlated with Number of Credit Problems High correlation
Current Credit Balance is highly correlated with Maximum Open Credit High correlation
Maximum Open Credit is highly correlated with Current Credit Balance High correlation
Home Ownership is highly correlated with Purpose High correlation
Purpose is highly correlated with Home Ownership High correlation
Loan ID has 353 (3.4%) missing values Missing
Customer ID has 353 (3.4%) missing values Missing
Current Loan Amount has 353 (3.4%) missing values Missing
Term has 353 (3.4%) missing values Missing
Credit Score has 2334 (22.5%) missing values Missing
Annual Income has 2334 (22.5%) missing values Missing
Years in current job has 780 (7.5%) missing values Missing
Home Ownership has 353 (3.4%) missing values Missing
Purpose has 353 (3.4%) missing values Missing
Monthly Debt has 353 (3.4%) missing values Missing
Years of Credit History has 353 (3.4%) missing values Missing
Months since last delinquent has 5659 (54.7%) missing values Missing
Number of Open Accounts has 353 (3.4%) missing values Missing
Number of Credit Problems has 353 (3.4%) missing values Missing
Current Credit Balance has 353 (3.4%) missing values Missing
Maximum Open Credit has 353 (3.4%) missing values Missing
Bankruptcies has 375 (3.6%) missing values Missing
Tax Liens has 354 (3.4%) missing values Missing
Maximum Open Credit is highly skewed (γ1 = 51.4146651) Skewed
Loan ID is uniformly distributed Uniform
Customer ID is uniformly distributed Uniform
Number of Credit Problems has 8653 (83.6%) zeros Zeros
Tax Liens has 9810 (94.8%) zeros Zeros
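
Most columns report exactly 353 missing values, and 10353 observations minus 353 equals the 10000 distinct Loan ID values. That pattern is consistent with 353 rows that are empty in every column, although the report alone does not confirm that the same rows are affected. Under that assumption, a minimal pandas sketch for dropping such rows could look like this (`drop_empty_rows` and the toy frame are illustrative, not part of the report):

```python
import numpy as np
import pandas as pd


def drop_empty_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows in which every column is missing."""
    return df.dropna(how="all").reset_index(drop=True)


# Toy frame: the middle row is empty in every column.
toy = pd.DataFrame(
    {
        "Loan ID": ["a", None, "b"],
        "Credit Score": [720.0, np.nan, np.nan],
    }
)
cleaned = drop_empty_rows(toy)
print(len(toy), "->", len(cleaned))
```

Rows that are only partly missing (for example a loan without a Credit Score) are kept, because `how="all"` removes a row only when no column has a value.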

Reproduction

Analysis started: 2023-12-10 13:05:53.860926
Analysis finished: 2023-12-10 13:06:23.550914
Duration: 29.69 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Loan ID
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 10000
Distinct (%): 100.0%
Missing: 353
Missing (%): 3.4%
Memory size: 919.4 KiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: f738779f-c726-40dc-92cf-689d73af533d
2nd row: 6dcc0947-164d-476c-a1de-3ae7283dde0a
3rd row: f7744d01-894b-49c3-8777-fc6431a2cff1
4th row: 83721ffb-b99a-4a0f-aea5-ef472a138b41
5th row: 08f3789f-5714-4b10-929d-e1527ab5e5a3

Common Values

Value | Count | Frequency (%)
ed055a45-8724-44e3-9941-f571015e2c4a | 1 | < 0.1%
b052ec5f-b849-44e0-8218-57c9bcebbe88 | 1 | < 0.1%
ff3136a6-7227-44f0-81df-299b3ea41c33 | 1 | < 0.1%
48b259bd-a453-4c8e-8426-2a2a9a7a7775 | 1 | < 0.1%
3940bd5e-35ed-43e0-9af7-5db097149931 | 1 | < 0.1%
2d399fa6-58f0-4888-99eb-5038d5bdc5c5 | 1 | < 0.1%
98463866-41a0-4fd5-bbad-b1721be9c8f1 | 1 | < 0.1%
71f9269f-adf4-4da8-af89-772e3041ee8d | 1 | < 0.1%
bc55e336-7fec-48c8-8c25-88c26ffddfe8 | 1 | < 0.1%
1c7fcfe6-077d-43e6-807a-e169409430a7 | 1 | < 0.1%
Other values (9990) | 9990 | 96.5%
(Missing) | 353 | 3.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
ed055a45-8724-44e3-9941-f571015e2c4a | 1 | < 0.1%
0b2f1b66-741e-4e37-a929-99926cdc9e9a | 1 | < 0.1%
add946a5-20a5-4211-bf22-408525123b1d | 1 | < 0.1%
f7744d01-894b-49c3-8777-fc6431a2cff1 | 1 | < 0.1%
83721ffb-b99a-4a0f-aea5-ef472a138b41 | 1 | < 0.1%
08f3789f-5714-4b10-929d-e1527ab5e5a3 | 1 | < 0.1%
a4957169-d809-44cc-847b-975400bc8d11 | 1 | < 0.1%
43467302-94fe-494b-b52f-3fd891fea71c | 1 | < 0.1%
930c7cb3-6086-434a-9547-3ed41c181552 | 1 | < 0.1%
d08f3a5e-93df-40e7-bdd8-cba59180bddf | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Customer ID
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 10000
Distinct (%): 100.0%
Missing: 353
Missing (%): 3.4%
Memory size: 919.4 KiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: ded0b3c3-6bf4-4091-8726-47039f2c1b90
2nd row: 1630e6e3-34e3-461a-8fda-09297d3140c8
3rd row: 2c60938b-ad2b-4702-804d-eeca43949c52
4th row: 12116614-2f3c-4d16-ad34-d92883718806
5th row: 39888105-fd5f-4023-860a-30a3e6f5ccb7

Common Values

Value | Count | Frequency (%)
19af16fc-b2b2-44dd-8ce3-c5556b419d65 | 1 | < 0.1%
3bc77c02-e39e-4341-941b-9d4dc17b660e | 1 | < 0.1%
fa6a4c11-7748-48cc-914c-2ea2ec84773e | 1 | < 0.1%
9328e4e2-8488-41aa-8ef5-869b74fed3fb | 1 | < 0.1%
b4a93297-456f-4ba5-987a-a491c783e86b | 1 | < 0.1%
8f65b9c7-bfb4-46ed-b2cd-ab8f5bc32328 | 1 | < 0.1%
3e407115-5bd7-43a2-a524-570beebdfd9e | 1 | < 0.1%
95afe69c-2f56-4493-a6cc-a937078c1631 | 1 | < 0.1%
6c5d2350-56e0-4101-a185-58cede981065 | 1 | < 0.1%
ebd746ba-91b4-4985-99f7-47e4999c5618 | 1 | < 0.1%
Other values (9990) | 9990 | 96.5%
(Missing) | 353 | 3.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
19af16fc-b2b2-44dd-8ce3-c5556b419d65 | 1 | < 0.1%
6a1adeda-079b-49e5-ac7c-91828f2806a0 | 1 | < 0.1%
163b8125-8f24-4b8f-ba59-23ea017f5b48 | 1 | < 0.1%
2c60938b-ad2b-4702-804d-eeca43949c52 | 1 | < 0.1%
12116614-2f3c-4d16-ad34-d92883718806 | 1 | < 0.1%
39888105-fd5f-4023-860a-30a3e6f5ccb7 | 1 | < 0.1%
6878d414-6a22-4712-ae43-9b3f798e463a | 1 | < 0.1%
48113a98-a4a0-4956-b57d-f0ce344826fb | 1 | < 0.1%
19941661-98e2-4800-93c9-a0e92057c813 | 1 | < 0.1%
4080a828-a61a-4f04-a627-397f4319500c | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Current Loan Amount
Real number (ℝ≥0)

MISSING

Distinct: 6786
Distinct (%): 67.9%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 11603801.21
Minimum: 19470
Maximum: 99999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 19470
5-th percentile: 73036.7
Q1: 178948
Median: 309276
Q3: 515707.5
95-th percentile: 99999999
Maximum: 99999999
Range: 99980529
Interquartile range (IQR): 336759.5

Descriptive statistics

Standard deviation: 31600097.14
Coefficient of variation (CV): 2.72325392
Kurtosis: 3.956085669
Mean: 11603801.21
Median Absolute Deviation (MAD): 144331
Skewness: 2.440286167
Sum: 1.160380121 × 10^11
Variance: 9.985661393 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
99999999 | 1133 | 10.9%
172436 | 7 | 0.1%
154704 | 6 | 0.1%
442596 | 6 | 0.1%
221892 | 6 | 0.1%
257554 | 5 | < 0.1%
109472 | 5 | < 0.1%
223036 | 5 | < 0.1%
218834 | 5 | < 0.1%
265188 | 5 | < 0.1%
Other values (6776) | 8817 | 85.2%
(Missing) | 353 | 3.4%

Value | Count | Frequency (%)
19470 | 1 | < 0.1%
21472 | 1 | < 0.1%
21516 | 1 | < 0.1%
21560 | 2 | < 0.1%
21604 | 3 | < 0.1%
21626 | 1 | < 0.1%
21670 | 1 | < 0.1%
21692 | 1 | < 0.1%
21890 | 2 | < 0.1%
21956 | 1 | < 0.1%

Value | Count | Frequency (%)
99999999 | 1133 | 10.9%
789096 | 1 | < 0.1%
789030 | 3 | < 0.1%
788942 | 2 | < 0.1%
788480 | 4 | < 0.1%
788414 | 2 | < 0.1%
788326 | 1 | < 0.1%
788260 | 1 | < 0.1%
788172 | 1 | < 0.1%
788018 | 2 | < 0.1%
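
The value 99999999 occurs 1133 times (10.9%), is both the 95-th percentile and the maximum, and sits far above the next largest amount (789096). That suggests it is a placeholder rather than a real loan amount, which would also explain why the mean (11603801.21) is so far above the median (309276). This reading is an assumption; the report does not say what the value means. A minimal pandas sketch that masks it before computing statistics (`SENTINEL` and `mask_sentinel` are illustrative names):

```python
import numpy as np
import pandas as pd

SENTINEL = 99_999_999  # assumption: placeholder code, not a real amount


def mask_sentinel(amounts: pd.Series, sentinel: float = SENTINEL) -> pd.Series:
    """Return a copy in which sentinel values are replaced by NaN."""
    return amounts.mask(amounts == sentinel)


# Toy values taken from the tables above, plus one sentinel and one gap.
toy = pd.Series([172436.0, 309276.0, 99999999.0, np.nan])
masked = mask_sentinel(toy)
print(masked.mean())
```

Masking rather than dropping keeps the row available for the other columns; whether the placeholder should instead be imputed depends on what it encodes in the source system.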

Term
Categorical

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 353
Missing (%): 3.4%
Memory size: 662.8 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.7295
Min length: 9

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Short Term
2nd row: Short Term
3rd row: Short Term
4th row: Short Term
5th row: Short Term

Common Values

Value | Count | Frequency (%)
Short Term | 7295 | 70.5%
Long Term | 2705 | 26.1%
(Missing) | 353 | 3.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
term | 10000 | 50.0%
short | 7295 | 36.5%
long | 2705 | 13.5%

Credit Score
Real number (ℝ≥0)

MISSING

Distinct: 272
Distinct (%): 3.4%
Missing: 2334
Missing (%): 22.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1077.99152
Minimum: 585
Maximum: 7510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 585
5-th percentile: 662
Q1: 706
Median: 725
Q3: 741
95-th percentile: 6710
Maximum: 7510
Range: 6925
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 1477.467761
Coefficient of variation (CV): 1.370574567
Kurtosis: 12.9134383
Mean: 1077.99152
Median Absolute Deviation (MAD): 17
Skewness: 3.855418036
Sum: 8644414
Variance: 2182910.984
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
747 | 197 | 1.9%
740 | 195 | 1.9%
746 | 177 | 1.7%
738 | 176 | 1.7%
742 | 176 | 1.7%
741 | 175 | 1.7%
739 | 164 | 1.6%
745 | 157 | 1.5%
748 | 157 | 1.5%
722 | 156 | 1.5%
Other values (262) | 6289 | 60.7%
(Missing) | 2334 | 22.5%

Value | Count | Frequency (%)
585 | 1 | < 0.1%
586 | 1 | < 0.1%
587 | 1 | < 0.1%
588 | 1 | < 0.1%
594 | 1 | < 0.1%
595 | 3 | < 0.1%
596 | 2 | < 0.1%
597 | 1 | < 0.1%
598 | 2 | < 0.1%
599 | 2 | < 0.1%

Value | Count | Frequency (%)
7510 | 1 | < 0.1%
7500 | 3 | < 0.1%
7490 | 1 | < 0.1%
7480 | 5 | < 0.1%
7470 | 3 | < 0.1%
7460 | 8 | 0.1%
7450 | 8 | 0.1%
7440 | 5 | < 0.1%
7430 | 8 | 0.1%
7420 | 11 | 0.1%
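
The largest Credit Score values (7510, 7500, 7490, ...) are all multiples of 10 and roughly ten times the typical range (Q1 706, Q3 741), while the 95-th percentile is already 6710. One plausible explanation, which the report does not confirm, is that some scores were recorded with an extra trailing zero. A minimal sketch under that assumption; the 850 cut-off is also an assumption about the scoring scale, and `rescale_credit_score` is an illustrative name:

```python
from typing import Optional

MAX_VALID_SCORE = 850  # assumption: upper bound of the scoring scale


def rescale_credit_score(score: Optional[float]) -> Optional[float]:
    """Divide implausibly large scores by 10; leave other values untouched."""
    if score is None:
        return None
    return score / 10 if score > MAX_VALID_SCORE else score


print([rescale_credit_score(s) for s in (725, 7510, None)])
```

With pandas the same rule can be applied column-wise, for example through `Series.map`, after first checking a sample of the affected rows by hand.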

Annual Income
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 7200
Distinct (%): 89.8%
Missing: 2334
Missing (%): 22.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1369106.04
Minimum: 81092
Maximum: 17815350
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 81092
5-th percentile: 520045.2
Q1: 848340.5
Median: 1168272
Q3: 1664390.5
95-th percentile: 2788525.5
Maximum: 17815350
Range: 17734258
Interquartile range (IQR): 816050

Descriptive statistics

Standard deviation: 868755.7309
Coefficient of variation (CV): 0.6345423259
Kurtosis: 42.22362967
Mean: 1369106.04
Median Absolute Deviation (MAD): 384484
Skewness: 4.158243202
Sum: 1.097886133 × 10^10
Variance: 7.5473652 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
955605 | 6 | 0.1%
853803 | 6 | 0.1%
1137606 | 5 | < 0.1%
1158696 | 5 | < 0.1%
1408185 | 4 | < 0.1%
1243531 | 4 | < 0.1%
1403492 | 4 | < 0.1%
1137492 | 4 | < 0.1%
1399217 | 4 | < 0.1%
1120050 | 4 | < 0.1%
Other values (7190) | 7973 | 77.0%
(Missing) | 2334 | 22.5%

Value | Count | Frequency (%)
81092 | 1 | < 0.1%
91485 | 1 | < 0.1%
116603 | 1 | < 0.1%
130150 | 1 | < 0.1%
152114 | 1 | < 0.1%
163305 | 1 | < 0.1%
163894 | 1 | < 0.1%
175845 | 1 | < 0.1%
181982 | 1 | < 0.1%
182438 | 1 | < 0.1%

Value | Count | Frequency (%)
17815350 | 1 | < 0.1%
16244088 | 1 | < 0.1%
12574770 | 1 | < 0.1%
9589300 | 1 | < 0.1%
9434450 | 1 | < 0.1%
9346100 | 1 | < 0.1%
9338880 | 1 | < 0.1%
9336600 | 1 | < 0.1%
8548290 | 1 | < 0.1%
8526364 | 1 | < 0.1%
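
Several of the descriptive statistics for Annual Income are derived from others in the same section: the coefficient of variation is the standard deviation divided by the mean, the interquartile range is Q3 minus Q1, and the variance is the squared standard deviation. A short sketch that recomputes them from the reported figures (variable names are illustrative):

```python
# Figures copied from the Annual Income statistics above.
mean = 1369106.04
std = 868755.7309
q1, q3 = 848340.5, 1664390.5

cv = std / mean       # coefficient of variation
iqr = q3 - q1         # interquartile range
variance = std ** 2   # variance as squared standard deviation

print(f"CV={cv:.4f}, IQR={iqr}, variance={variance:.4e}")
```

Small differences in the last digits are expected, because the report prints rounded inputs.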

Years in current job
Categorical

MISSING

Distinct: 11
Distinct (%): 0.1%
Missing: 780
Missing (%): 7.5%
Memory size: 629.0 KiB

Length

Max length: 9
Median length: 7
Mean length: 7.659876737
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10+ years
2nd row: 10+ years
3rd row: 2 years
4th row: 10+ years
5th row: 10+ years

Common Values

Value | Count | Frequency (%)
10+ years | 3085 | 29.8%
2 years | 916 | 8.8%
3 years | 866 | 8.4%
< 1 year | 795 | 7.7%
5 years | 696 | 6.7%
1 year | 648 | 6.3%
4 years | 613 | 5.9%
6 years | 566 | 5.5%
7 years | 554 | 5.4%
8 years | 472 | 4.6%
(Missing) | 780 | 7.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
years | 8130 | 40.8%
10 | 3085 | 15.5%
1 | 1443 | 7.2%
year | 1443 | 7.2%
2 | 916 | 4.6%
3 | 866 | 4.3%
< | 795 | 4.0%
5 | 696 | 3.5%
4 | 613 | 3.1%
6 | 566 | 2.8%
Other values (3) | 1388 | 7.0%
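
Years in current job is stored as text labels ("< 1 year", "2 years", "10+ years"), which is why the report treats it as categorical. If a numeric feature is wanted, one option is to parse the labels. The sketch below is a hypothetical helper, not part of the report; mapping "< 1 year" to 0 and capping "10+ years" at 10 are assumptions:

```python
import re
from typing import Optional


def parse_years_in_job(label: Optional[str]) -> Optional[int]:
    """Map labels such as '< 1 year', '3 years', '10+ years' to whole years."""
    if label is None:
        return None
    text = label.strip()
    if text.startswith("<"):
        return 0  # assumption: '< 1 year' counts as zero completed years
    match = re.search(r"\d+", text)
    # '10+ years' becomes 10, so longer tenures are capped at 10.
    return int(match.group()) if match else None


print([parse_years_in_job(x) for x in ("< 1 year", "2 years", "10+ years", None)])
```

Because "10+ years" is the most common label (29.8%), the parsed column is censored at 10 for a large share of rows, which is worth keeping in mind when interpreting it.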

Home Ownership
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 353
Missing (%): 3.4%
Memory size: 653.3 KiB

Length

Max length: 13
Median length: 8
Mean length: 8.7587
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Home Mortgage
2nd row: Home Mortgage
3rd row: Rent
4th row: Rent
5th row: Home Mortgage

Common Values

Value | Count | Frequency (%)
Home Mortgage | 4867 | 47.0%
Rent | 4203 | 40.6%
Own Home | 914 | 8.8%
HaveMortgage | 16 | 0.2%
(Missing) | 353 | 3.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
home | 5781 | 36.6%
mortgage | 4867 | 30.8%
rent | 4203 | 26.6%
own | 914 | 5.8%
havemortgage | 16 | 0.1%

Purpose
Categorical

HIGH CORRELATION
MISSING

Distinct: 16
Distinct (%): 0.2%
Missing: 353
Missing (%): 3.4%
Memory size: 727.8 KiB

Length

Max length: 20
Median length: 18
Mean length: 16.387
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Debt Consolidation
2nd row: Debt Consolidation
3rd row: Debt Consolidation
4th row: Debt Consolidation
5th row: Debt Consolidation

Common Values

Value | Count | Frequency (%)
Debt Consolidation | 7878 | 76.1%
Home Improvements | 593 | 5.7%
other | 561 | 5.4%
Other | 308 | 3.0%
Business Loan | 163 | 1.6%
Buy a Car | 142 | 1.4%
Medical Bills | 113 | 1.1%
Buy House | 70 | 0.7%
major_purchase | 52 | 0.5%
Take a Trip | 44 | 0.4%
Other values (6) | 76 | 0.7%
(Missing) | 353 | 3.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
debt | 7878 | 41.0%
consolidation | 7878 | 41.0%
other | 869 | 4.5%
home | 593 | 3.1%
improvements | 593 | 3.1%
buy | 212 | 1.1%
a | 186 | 1.0%
business | 163 | 0.8%
loan | 163 | 0.8%
car | 142 | 0.7%
Other values (13) | 526 | 2.7%
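
The Purpose labels are not spelled consistently: "other" (561) and "Other" (308) are counted as separate categories, and "major_purchase" uses an underscore where the other labels use spaces. A minimal sketch of one way to normalise such labels before counting; `normalize_label` is an illustrative helper, and folding case and underscores is an assumption about which variants mean the same thing:

```python
from collections import Counter
from typing import Optional


def normalize_label(label: Optional[str]) -> Optional[str]:
    """Lower-case a label, turn underscores into spaces, collapse whitespace."""
    if label is None:
        return None
    return " ".join(label.replace("_", " ").split()).casefold()


# Counts copied from the Common Values table above (subset).
reported = {"Debt Consolidation": 7878, "other": 561, "Other": 308, "major_purchase": 52}

merged = Counter()
for label, count in reported.items():
    merged[normalize_label(label)] += count

print(merged["other"], merged["major purchase"])
```

The merged count for "other" (869) matches the word-frequency table above, which already lower-cases tokens.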

Monthly Debt
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 9729
Distinct (%): 97.3%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 18429.6717
Minimum: 0
Maximum: 229057.92
Zeros: 8
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3585.585
Q1: 10202.8575
Median: 16052.055
Q3: 23881.3375
95-th percentile: 40596.6825
Maximum: 229057.92
Range: 229057.92
Interquartile range (IQR): 13678.48

Descriptive statistics

Standard deviation: 12399.95619
Coefficient of variation (CV): 0.6728256691
Kurtosis: 14.8449022
Mean: 18429.6717
Median Absolute Deviation (MAD): 6614.185
Skewness: 2.245736435
Sum: 184296717
Variance: 153758913.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 8 | 0.1%
13182.39 | 3 | < 0.1%
15977.67 | 3 | < 0.1%
14432.4 | 3 | < 0.1%
12626.45 | 3 | < 0.1%
13395.38 | 3 | < 0.1%
12907.08 | 3 | < 0.1%
18710.25 | 3 | < 0.1%
15324.45 | 3 | < 0.1%
14934.95 | 2 | < 0.1%
Other values (9719) | 9966 | 96.3%
(Missing) | 353 | 3.4%

Value | Count | Frequency (%)
0 | 8 | 0.1%
113.62 | 1 | < 0.1%
190.19 | 1 | < 0.1%
191.52 | 1 | < 0.1%
278.92 | 1 | < 0.1%
281.96 | 1 | < 0.1%
286.52 | 1 | < 0.1%
288.42 | 1 | < 0.1%
288.61 | 1 | < 0.1%
292.22 | 1 | < 0.1%

Value | Count | Frequency (%)
229057.92 | 1 | < 0.1%
143526.57 | 1 | < 0.1%
139664.82 | 1 | < 0.1%
114938.98 | 1 | < 0.1%
111289.84 | 1 | < 0.1%
109791.5 | 1 | < 0.1%
103778.95 | 1 | < 0.1%
101813.97 | 1 | < 0.1%
97996.49 | 1 | < 0.1%
97150.42 | 1 | < 0.1%

Years of Credit History
Real number (ℝ≥0)

MISSING

Distinct: 424
Distinct (%): 4.2%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.23593
Minimum: 3.8
Maximum: 62.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 3.8
5-th percentile: 8.9
Q1: 13.6
Median: 17
Q3: 21.7
95-th percentile: 31.7
Maximum: 62.5
Range: 58.7
Interquartile range (IQR): 8.1

Descriptive statistics

Standard deviation: 7.018355774
Coefficient of variation (CV): 0.3848641541
Kurtosis: 1.748104942
Mean: 18.23593
Median Absolute Deviation (MAD): 4
Skewness: 1.071869791
Sum: 182359.3
Variance: 49.25731777
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
16 | 141 | 1.4%
14 | 130 | 1.3%
17 | 127 | 1.2%
15.4 | 125 | 1.2%
16.5 | 121 | 1.2%
13 | 116 | 1.1%
15 | 113 | 1.1%
14.5 | 104 | 1.0%
18.5 | 96 | 0.9%
12 | 92 | 0.9%
Other values (414) | 8835 | 85.3%
(Missing) | 353 | 3.4%

Value | Count | Frequency (%)
3.8 | 1 | < 0.1%
4 | 1 | < 0.1%
4.1 | 1 | < 0.1%
4.3 | 2 | < 0.1%
4.5 | 1 | < 0.1%
4.7 | 3 | < 0.1%
4.8 | 4 | < 0.1%
4.9 | 2 | < 0.1%
5 | 7 | 0.1%
5.1 | 4 | < 0.1%

Value | Count | Frequency (%)
62.5 | 1 | < 0.1%
57.5 | 1 | < 0.1%
52.5 | 1 | < 0.1%
51.8 | 1 | < 0.1%
50.9 | 1 | < 0.1%
50 | 1 | < 0.1%
49.9 | 1 | < 0.1%
49.4 | 1 | < 0.1%
49.2 | 1 | < 0.1%
49 | 2 | < 0.1%

Months since last delinquent
Real number (ℝ≥0)

MISSING

Distinct: 89
Distinct (%): 1.9%
Missing: 5659
Missing (%): 54.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.96463571
Minimum: 0
Maximum: 131
Zeros: 17
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 17
Median: 32
Q3: 50
95-th percentile: 75
Maximum: 131
Range: 131
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 21.64029066
Coefficient of variation (CV): 0.6189193801
Kurtosis: -0.7471571255
Mean: 34.96463571
Median Absolute Deviation (MAD): 16
Skewness: 0.4462521184
Sum: 164124
Variance: 468.3021797
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
15 | 109 | 1.1%
17 | 103 | 1.0%
9 | 94 | 0.9%
12 | 92 | 0.9%
23 | 90 | 0.9%
8 | 87 | 0.8%
19 | 86 | 0.8%
38 | 86 | 0.8%
6 | 85 | 0.8%
11 | 85 | 0.8%
Other values (79) | 3777 | 36.5%
(Missing) | 5659 | 54.7%

Value | Count | Frequency (%)
0 | 17 | 0.2%
1 | 19 | 0.2%
2 | 41 | 0.4%
3 | 46 | 0.4%
4 | 44 | 0.4%
5 | 63 | 0.6%
6 | 85 | 0.8%
7 | 70 | 0.7%
8 | 87 | 0.8%
9 | 94 | 0.9%

Value | Count | Frequency (%)
131 | 1 | < 0.1%
107 | 1 | < 0.1%
88 | 1 | < 0.1%
87 | 2 | < 0.1%
86 | 1 | < 0.1%
83 | 2 | < 0.1%
82 | 16 | 0.2%
81 | 29 | 0.3%
80 | 34 | 0.3%
79 | 37 | 0.4%
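
More than half of the values (54.7%) are missing here. A plausible reading, which the report does not confirm, is that a missing value means no delinquency was ever recorded rather than an unknown one. Under that assumption it can be useful to keep the presence of a value as its own flag instead of imputing a number. A minimal sketch with an illustrative helper name:

```python
import math
from typing import Optional


def has_delinquency_record(months: Optional[float]) -> bool:
    """True when a months-since-delinquency value is present (not None/NaN)."""
    return months is not None and not math.isnan(months)


sample = [32.0, None, float("nan"), 0.0]
flags = [has_delinquency_record(m) for m in sample]
print(flags)
```

A value of 0 counts as present, which keeps the 17 reported zeros distinct from the missing entries.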

Number of Open Accounts
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 45
Distinct (%): 0.4%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.0841
Minimum: 1
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 7
Median: 10
Q3: 14
95-th percentile: 20
Maximum: 55
Range: 54
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.023380398
Coefficient of variation (CV): 0.4532059796
Kurtosis: 3.183041041
Mean: 11.0841
Median Absolute Deviation (MAD): 3
Skewness: 1.237947558
Sum: 110841
Variance: 25.23435063
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)

Value | Count | Frequency (%)
10 | 968 | 9.3%
8 | 913 | 8.8%
9 | 898 | 8.7%
7 | 883 | 8.5%
11 | 846 | 8.2%
6 | 711 | 6.9%
12 | 681 | 6.6%
13 | 546 | 5.3%
14 | 518 | 5.0%
5 | 455 | 4.4%
Other values (35) | 2581 | 24.9%

Value | Count | Frequency (%)
1 | 2 | < 0.1%
2 | 35 | 0.3%
3 | 125 | 1.2%
4 | 291 | 2.8%
5 | 455 | 4.4%
6 | 711 | 6.9%
7 | 883 | 8.5%
8 | 913 | 8.8%
9 | 898 | 8.7%
10 | 968 | 9.3%

Value | Count | Frequency (%)
55 | 1 | < 0.1%
47 | 2 | < 0.1%
43 | 2 | < 0.1%
42 | 1 | < 0.1%
41 | 1 | < 0.1%
40 | 1 | < 0.1%
39 | 2 | < 0.1%
38 | 1 | < 0.1%
37 | 2 | < 0.1%
36 | 4 | < 0.1%

Number of Credit Problems
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 9
Distinct (%): 0.1%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1655
Minimum: 0
Maximum: 10
Zeros: 8653
Zeros (%): 83.6%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 10
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5009339712
Coefficient of variation (CV): 3.026791367
Kurtosis: 62.05829572
Mean: 0.1655
Median Absolute Deviation (MAD): 0
Skewness: 5.704018402
Sum: 1655
Variance: 0.2509348435
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
0 | 8653 | 83.6%
1 | 1158 | 11.2%
2 | 127 | 1.2%
3 | 38 | 0.4%
4 | 10 | 0.1%
5 | 8 | 0.1%
9 | 3 | < 0.1%
6 | 2 | < 0.1%
10 | 1 | < 0.1%
(Missing) | 353 | 3.4%

Value | Count | Frequency (%)
0 | 8653 | 83.6%
1 | 1158 | 11.2%
2 | 127 | 1.2%
3 | 38 | 0.4%
4 | 10 | 0.1%
5 | 8 | 0.1%
6 | 2 | < 0.1%
9 | 3 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
10 | 1 | < 0.1%
9 | 3 | < 0.1%
6 | 2 | < 0.1%
5 | 8 | 0.1%
4 | 10 | 0.1%
3 | 38 | 0.4%
2 | 127 | 1.2%
1 | 1158 | 11.2%
0 | 8653 | 83.6%
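
The reported zero share (83.6%) is computed over all 10353 observations, including the 353 missing ones. Relative to the rows that actually have a value, the share of zeros is higher. A short sketch of both calculations from the reported counts (`share` is an illustrative helper):

```python
def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


N_ROWS = 10353   # observations in the dataset
N_MISSING = 353  # missing values in Number of Credit Problems
N_ZEROS = 8653   # zeros in Number of Credit Problems

zeros_of_all = share(N_ZEROS, N_ROWS)
zeros_of_present = share(N_ZEROS, N_ROWS - N_MISSING)

print(f"{zeros_of_all:.1f}% of all rows, {zeros_of_present:.1f}% of non-missing rows")
```

Which denominator is appropriate depends on whether missing rows are dropped before modelling.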

Current Credit Balance
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 8430
Distinct (%): 84.3%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 290730.0637
Minimum: 0
Maximum: 16237438
Zeros: 55
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30133.05
Q1: 108651.5
Median: 207518
Q3: 362463
95-th percentile: 755940.65
Maximum: 16237438
Range: 16237438
Interquartile range (IQR): 253811.5

Descriptive statistics

Standard deviation: 388168.6782
Coefficient of variation (CV): 1.335151492
Kurtosis: 458.0452274
Mean: 290730.0637
Median Absolute Deviation (MAD): 115681.5
Skewness: 14.87107845
Sum: 2907300637
Variance: 1.506749227 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 55 | 0.5%
106818 | 5 | < 0.1%
171836 | 5 | < 0.1%
208297 | 4 | < 0.1%
144210 | 4 | < 0.1%
76304 | 4 | < 0.1%
80845 | 4 | < 0.1%
229406 | 4 | < 0.1%
151563 | 4 | < 0.1%
252111 | 4 | < 0.1%
Other values (8420) | 9907 | 95.7%
(Missing) | 353 | 3.4%

Value | Count | Frequency (%)
0 | 55 | 0.5%
38 | 1 | < 0.1%
76 | 1 | < 0.1%
114 | 3 | < 0.1%
209 | 1 | < 0.1%
247 | 1 | < 0.1%
342 | 1 | < 0.1%
361 | 1 | < 0.1%
380 | 1 | < 0.1%
418 | 1 | < 0.1%

Value | Count | Frequency (%)
16237438 | 1 | < 0.1%
11796435 | 1 | < 0.1%
11576491 | 1 | < 0.1%
7111073 | 1 | < 0.1%
5384106 | 1 | < 0.1%
4882791 | 1 | < 0.1%
4778671 | 1 | < 0.1%
4252143 | 1 | < 0.1%
4091042 | 1 | < 0.1%
3964350 | 1 | < 0.1%

Maximum Open Credit
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 9064
Distinct (%): 90.6%
Missing: 353
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 687130.7806
Minimum: 0
Maximum: 145907344
Zeros: 62
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 111362.9
Q1: 270600
Median: 462605
Q3: 786115
95-th percentile: 1641385.9
Maximum: 145907344
Range: 145907344
Interquartile range (IQR): 515515

Descriptive statistics

Standard deviation: 1861394.4
Coefficient of variation (CV): 2.708937589
Kurtosis: 3768.372941
Mean: 687130.7806
Median Absolute Deviation (MAD): 229449
Skewness: 51.4146651
Sum: 6871307806
Variance: 3.464789114 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 62 | 0.6%
594836 | 4 | < 0.1%
452232 | 4 | < 0.1%
225258 | 4 | < 0.1%
312114 | 4 | < 0.1%
217602 | 3 | < 0.1%
622600 | 3 | < 0.1%
422950 | 3 | < 0.1%
316140 | 3 | < 0.1%
479952 | 3 | < 0.1%
Other values (9054) | 9907 | 95.7%
(Missing) | 353 | 3.4%

Value | Count | Frequency (%)
0 | 62 | 0.6%
5390 | 1 | < 0.1%
6644 | 1 | < 0.1%
8712 | 1 | < 0.1%
8866 | 1 | < 0.1%
10846 | 1 | < 0.1%
11066 | 2 | < 0.1%
11264 | 1 | < 0.1%
12958 | 1 | < 0.1%
13134 | 1 | < 0.1%

Value | Count | Frequency (%)
145907344 | 1 | < 0.1%
45042030 | 1 | < 0.1%
37527424 | 1 | < 0.1%
25148882 | 1 | < 0.1%
24582712 | 1 | < 0.1%
22454278 | 1 | < 0.1%
20172680 | 1 | < 0.1%
20108220 | 1 | < 0.1%
19194758 | 1 | < 0.1%
18287852 | 1 | < 0.1%
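
With a skewness of 51.41 and a maximum more than 300 times the median, Maximum Open Credit is dominated by a handful of very large values. A common way to compress such a tail is a log(1 + x) transform, which is defined at zero (62 rows here). This is one option among several, not something the report prescribes. A minimal sketch with an illustrative helper:

```python
import math
from typing import List


def log1p_all(values: List[float]) -> List[float]:
    """Apply log(1 + x) to each non-negative value; zeros stay at 0."""
    return [math.log1p(v) for v in values]


# Minimum, median, and maximum reported for Maximum Open Credit.
reported = [0, 462605, 145907344]
transformed = log1p_all(reported)
print([round(t, 2) for t in transformed])
```

The transform is monotonic, so quantiles map to quantiles, and `math.expm1` recovers the original scale when needed.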

Bankruptcies
Categorical

HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): 0.1%
Missing: 375
Missing (%): 3.6%
Memory size: 599.4 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 1.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 8895 | 85.9%
1.0 | 1022 | 9.9%
2.0 | 46 | 0.4%
3.0 | 14 | 0.1%
5.0 | 1 | < 0.1%
(Missing) | 375 | 3.6%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0.0 | 8895 | 89.1%
1.0 | 1022 | 10.2%
2.0 | 46 | 0.5%
3.0 | 14 | 0.1%
5.0 | 1 | < 0.1%
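
Bankruptcies is profiled as Categorical even though its values ("0.0", "1.0", "2.0", ...) are counts; with only five distinct values the report shows string lengths instead of numeric statistics. If the column is needed as a number, the labels can be parsed back into integers. A minimal sketch, with `parse_bankruptcies` as an illustrative name and the assumption that every non-missing label is a whole number written as a float:

```python
from typing import Optional


def parse_bankruptcies(label: Optional[str]) -> Optional[int]:
    """Convert labels such as '1.0' to the integer count 1; keep None as None."""
    if label is None:
        return None
    return int(float(label))


labels = ["0.0", "1.0", None, "5.0"]
counts = [parse_bankruptcies(x) for x in labels]
print(counts)
```

With pandas, `pd.to_numeric` on the column achieves the same conversion while keeping missing values as NaN.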

Tax Liens
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 8
Distinct (%): 0.1%
Missing: 354
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.03080308031
Minimum: 0
Maximum: 9
Zeros: 9810
Zeros (%): 94.8%
Negative: 0
Negative (%): 0.0%
Memory size: 81.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2882149869
Coefficient of variation (CV): 9.356693683
Kurtosis: 371.2997812
Mean: 0.03080308031
Median Absolute Deviation (MAD): 0
Skewness: 16.28358748
Sum: 308
Variance: 0.0830678787
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Value | Count | Frequency (%)
0 | 9810 | 94.8%
1 | 131 | 1.3%
2 | 31 | 0.3%
3 | 13 | 0.1%
4 | 8 | 0.1%
5 | 2 | < 0.1%
8 | 2 | < 0.1%
9 | 2 | < 0.1%
(Missing) | 354 | 3.4%

Value | Count | Frequency (%)
0 | 9810 | 94.8%
1 | 131 | 1.3%
2 | 31 | 0.3%
3 | 13 | 0.1%
4 | 8 | 0.1%
5 | 2 | < 0.1%
8 | 2 | < 0.1%
9 | 2 | < 0.1%

Value | Count | Frequency (%)
9 | 2 | < 0.1%
8 | 2 | < 0.1%
5 | 2 | < 0.1%
4 | 8 | 0.1%
3 | 13 | 0.1%
2 | 31 | 0.3%
1 | 131 | 1.3%
0 | 9810 | 94.8%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
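
The calculation can be sketched in plain Python as follows. This is an illustrative implementation, not the code used to produce this report. The helper names are hypothetical, and tied values are assumed to receive the average of their ranks.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations.
    The 1/(n - 1) factors cancel, so they are omitted."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]   # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

For this toy input ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.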

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
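
A minimal sketch of this formula, together with a check of the invariance property described above. The function name and the sample data are hypothetical, and the scale factors are assumed to be positive, since a negative factor would flip the sign of r.

```python
import math

def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their
    sample standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# Made-up illustrative data, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting and rescaling each variable separately leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A steep line and a shallow line are both perfectly linear, so r is 1 for each.
steep = pearson_r(x, [100 * a for a in x])
shallow = pearson_r(x, [0.01 * a for a in x])
print(r, r_shifted, steep, shallow)
```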

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
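
The pair-counting definition can be sketched directly. The function name below is hypothetical. The variant shown is often called tau-a and matches the description above. Library implementations frequently default to a tie-adjusted variant (tau-b), so results may differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite ways
            discordant += 1
        # direction == 0 means a tie, counted as neither
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]   # one swapped pair out of six
print(kendall_tau_a(x, y))
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = 4/6.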

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
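
A rough sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013). It is not the profiler's own code: the function name and the toy labels are hypothetical, and the chi-squared statistic is computed without a continuity correction, which is an assumption of this sketch.

```python
import math
from collections import Counter

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Subtract the expected bias, then shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:             # a constant column: association is undefined
        return float("nan")
    return math.sqrt(phi2_corr / denom)

# Toy labels, not taken from the profiled dataset.
perfect = cramers_v_corrected(["x", "x", "y", "y"] * 5, ["p", "p", "q", "q"] * 5)
independent = cramers_v_corrected(["x", "x", "y", "y"] * 5, ["p", "q", "p", "q"] * 5)
print(perfect, independent)
```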

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
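
Nullity correlation can be illustrated with a small sketch: each column is reduced to a present/absent indicator and the indicators are then correlated. This is an assumed reading of the heatmap description above, not the plotting library's own code. The function name and the two toy columns are hypothetical, with None standing in for a missing cell.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the 'value is present' indicators of two columns.
    Returns None when either column is entirely present or entirely missing."""
    def present(v):
        # Treat None and float NaN (which is not equal to itself) as missing.
        return 0.0 if v is None or v != v else 1.0

    a = [present(v) for v in col_a]
    b = [present(v) for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return None            # no variation in nullity, correlation undefined
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    return cov / math.sqrt(var_a * var_b)

# Toy columns that are always missing together.
score = [700.0, None, 710.0, None, 720.0, 730.0]
income = [50000.0, None, 42000.0, None, 61000.0, 38000.0]
# A column that is missing exactly where `score` is present.
opposite = [None, 3.0, None, 8.0, None, None]

print(nullity_correlation(score, income))    # missing together
print(nullity_correlation(score, opposite))  # missing in opposite rows
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one is present where the other is missing.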

Sample

First rows

Index | Loan ID | Customer ID | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens
0 | f738779f-c726-40dc-92cf-689d73af533d | ded0b3c3-6bf4-4091-8726-47039f2c1b90 | 611314.0 | Short Term | 747.0 | 2074116.0 | 10+ years | Home Mortgage | Debt Consolidation | 42000.83 | 21.8 | NaN | 9.0 | 0.0 | 621908.0 | 1058970.0 | 0.0 | 0.0
1 | 6dcc0947-164d-476c-a1de-3ae7283dde0a | 1630e6e3-34e3-461a-8fda-09297d3140c8 | 266662.0 | Short Term | 734.0 | 1919190.0 | 10+ years | Home Mortgage | Debt Consolidation | 36624.40 | 19.4 | NaN | 11.0 | 0.0 | 679573.0 | 904442.0 | 0.0 | 0.0
2 | f7744d01-894b-49c3-8777-fc6431a2cff1 | 2c60938b-ad2b-4702-804d-eeca43949c52 | 153494.0 | Short Term | 709.0 | 871112.0 | 2 years | Rent | Debt Consolidation | 8391.73 | 12.5 | 10.0 | 10.0 | 0.0 | 38532.0 | 388036.0 | 0.0 | 0.0
3 | 83721ffb-b99a-4a0f-aea5-ef472a138b41 | 12116614-2f3c-4d16-ad34-d92883718806 | 176242.0 | Short Term | 727.0 | 780083.0 | 10+ years | Rent | Debt Consolidation | 16771.87 | 16.5 | 27.0 | 16.0 | 1.0 | 156940.0 | 531322.0 | 1.0 | 0.0
4 | 08f3789f-5714-4b10-929d-e1527ab5e5a3 | 39888105-fd5f-4023-860a-30a3e6f5ccb7 | 321992.0 | Short Term | 744.0 | 1761148.0 | 10+ years | Home Mortgage | Debt Consolidation | 39478.77 | 26.0 | 44.0 | 14.0 | 0.0 | 359765.0 | 468072.0 | 0.0 | 0.0
5 | a4957169-d809-44cc-847b-975400bc8d11 | 6878d414-6a22-4712-ae43-9b3f798e463a | 202928.0 | Short Term | 741.0 | 760380.0 | 1 year | Rent | Debt Consolidation | 6526.69 | 13.8 | NaN | 6.0 | 0.0 | 258647.0 | 476872.0 | 0.0 | 0.0
6 | 43467302-94fe-494b-b52f-3fd891fea71c | 48113a98-a4a0-4956-b57d-f0ce344826fb | 621786.0 | Long Term | 733.0 | 1783606.0 | 10+ years | Home Mortgage | Debt Consolidation | 36563.98 | 15.3 | NaN | 42.0 | 0.0 | 281599.0 | 1449162.0 | 0.0 | 0.0
7 | 930c7cb3-6086-434a-9547-3ed41c181552 | 19941661-98e2-4800-93c9-a0e92057c813 | 266794.0 | Long Term | NaN | NaN | < 1 year | Own Home | Debt Consolidation | 12336.89 | 5.8 | NaN | 9.0 | 0.0 | 233206.0 | 342232.0 | 0.0 | 0.0
8 | 0b2f1b66-741e-4e37-a929-99926cdc9e9a | 6a1adeda-079b-49e5-ac7c-91828f2806a0 | 202466.0 | Short Term | 736.0 | 1068617.0 | 5 years | Rent | Debt Consolidation | 18745.21 | 20.5 | NaN | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
9 | d08f3a5e-93df-40e7-bdd8-cba59180bddf | 4080a828-a61a-4f04-a627-397f4319500c | 266288.0 | Long Term | 683.0 | 2031518.0 | 2 years | Rent | Debt Consolidation | 12443.10 | 24.4 | 56.0 | 8.0 | 2.0 | 31445.0 | 251130.0 | 2.0 | 0.0

Last rows

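Index | Loan ID | Customer ID | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens
10343 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10344 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10345 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10346 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10347 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10348 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10349 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10350 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10351 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10352 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN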